Base pair probability estimates improve the prediction accuracy of RNA non-canonical base pairs
Authors
Abstract
Prediction of RNA tertiary structure from sequence is an important problem, but generating accurate structure models for even short sequences remains difficult. Predictions of RNA tertiary structure tend to be least accurate in loop regions, where non-canonical pairs are important for determining the details of structure. Non-canonical pairs can be predicted using a knowledge-based model of structure that scores nucleotide cyclic motifs, or NCMs. In this work, a partition function algorithm is introduced that allows the estimation of base pairing probabilities for both canonical and non-canonical interactions. Pairs that are predicted to be probable are more likely to be found in the true structure than pairs of lower probability. Pair probability estimates can be further improved by predicting the structure conserved across multiple homologous sequences using the TurboFold algorithm. These pairing probabilities, used in concert with prior knowledge of the canonical secondary structure, allow accurate inference of non-canonical pairs, an important step towards accurate prediction of the full tertiary structure. Software to predict non-canonical base pairs and pairing probabilities is now provided as part of the RNAstructure software package.
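As a rough illustration of what a base pairing probability means in a partition function setting, the sketch below enumerates every non-crossing structure of a very short sequence, weights each structure by a Boltzmann factor, and reports the fraction of total weight carried by structures that contain each pair. This is not the NCM-based algorithm described in the abstract and not the RNAstructure API: the per-pair energies, the G-A entry standing in for a non-canonical pair, the `MIN_LOOP` constraint, the choice of kT = 1, and all function names are assumptions made for the example. Exhaustive enumeration is used only because it is easy to check by hand; it does not scale beyond toy sequences.

```python
from collections import defaultdict
from math import exp

# Hypothetical per-pair energies in arbitrary units (kT = 1).
# These values are illustrative only and come from no published model.
PAIR_ENERGY = {
    frozenset("GC"): -3.0,
    frozenset("AU"): -2.0,
    frozenset("GU"): -1.0,
    frozenset("GA"): -0.5,  # stand-in for a non-canonical pair
}
MIN_LOOP = 3  # assumed minimum number of unpaired nucleotides inside a pair


def pair_weight(a, b):
    """Boltzmann weight of pairing nucleotides a and b (0.0 if not allowed)."""
    energy = PAIR_ENERGY.get(frozenset((a, b)))
    return 0.0 if energy is None else exp(-energy)


def enumerate_structures(seq, i, j):
    """Yield (weight, pairs) for every non-crossing structure on seq[i..j]."""
    if i > j:
        yield 1.0, ()
        return
    # Case 1: position i is unpaired.
    yield from enumerate_structures(seq, i + 1, j)
    # Case 2: position i pairs with some k, splitting the rest into
    # an enclosed region (i+1..k-1) and a trailing region (k+1..j).
    for k in range(i + MIN_LOOP + 1, j + 1):
        w = pair_weight(seq[i], seq[k])
        if w == 0.0:
            continue
        for w_in, p_in in enumerate_structures(seq, i + 1, k - 1):
            for w_out, p_out in enumerate_structures(seq, k + 1, j):
                yield w * w_in * w_out, ((i, k),) + p_in + p_out


def pair_probabilities(seq):
    """Return {(i, j): probability} using 0-based positions."""
    z = 0.0  # partition function: total weight of all structures
    totals = defaultdict(float)  # weight of structures containing each pair
    for weight, pairs in enumerate_structures(seq, 0, len(seq) - 1):
        z += weight
        for pair in pairs:
            totals[pair] += weight
    return {pair: w / z for pair, w in totals.items()}
```

For the five-nucleotide sequence `GAAAC`, only two structures exist under these assumptions (fully unpaired, or G0 paired with C4), so `pair_probabilities("GAAAC")` assigns the pair `(0, 4)` a probability of e^3 / (1 + e^3), roughly 0.95. A practical implementation would replace the enumeration with dynamic programming and the toy energies with a real scoring model.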
Similar resources
Technical Report: An n-free-passes CYK algorithm for error-correction and the prediction of non-canonical base-pairs in RNA secondary structure
Background: The prediction of non-canonical base-pairs in RNA secondary structure prediction has become increasingly important with the advent of next-generation sequencing technologies, where sequencing errors can introduce artificial non-canonical base-pairs in RNA secondary structure. These base-pairs are not appropriately accounted for by the currently existing models. Results: Here we focu...
Introduction to Computational Biology Lecture # 32: RNA secondary structure predication
Today we are going to learn about prediction of RNA folding. RNA is a polymer of four types of nucleotide subunits: A, C, G, and U. C-G and A-U form hydrogen-bonded base pairs and are said to be complementary. G-C pairs form three hydrogen bonds and tend to be more stable than A-U pairs, which form only two. In addition to the canonical A-U and G-C pairs, non-canonical pairs can also occur in RNA se...
Conformational specificity of non-canonical base pairs and higher order structures in nucleic acids: crystal structure database analysis
Non-canonical base pairs, mostly present in the RNA, often play a prominent role towards maintaining their structural diversity. Higher order structures like base triples are also important in defining and stabilizing the tertiary folded structure of RNA. We have developed a new program BPFIND to analyze different types of canonical and non-canonical base pairs and base triples involving at lea...
Improved RNA secondary structure prediction by maximizing expected pair accuracy.
Free energy minimization has been the most popular method for RNA secondary structure prediction for decades. It is based on a set of empirical free energy change parameters derived from experiments using a nearest-neighbor model. In this study, a program, MaxExpect, that predicts RNA secondary structure by maximizing the expected base-pair accuracy, is reported. This approach was first pioneer...
Inverse folding of RNA
The aim of the inverse folding problem for RNA is, given a target structure like e.g. the one depicted in Fig. 1, find a sequence that folds into this structure. In this project we will exclusively focus on the secondary structure. The main driving force behind RNA structure formation is the creation of base pairs similar to the ones observed in the DNA double helical structure. In RNA thymine ...
Journal title:
Volume 13, Issue
Pages -
Publication date 2017